Chengliang Xu

BilliardPhys-Bench: Benchmarking Physical Reasoning and Visual Dynamics of Multimodal LLMs

May 29, 2026
Via arXiv

CrystalXRD-Bench: Benchmarking Vision-Language Models for XRD Peak Indexing Across Diverse Crystalline Materials

May 28, 2026
Via arXiv

FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

Apr 04, 2026
Via arXiv

SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy

Feb 26, 2026
Via arXiv

HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam

Feb 17, 2026
Via arXiv